BRIEF OUTLINE OF RESEARCH FINDINGS


Introduction

In this research project we proposed to "investigate various models of programming
effort estimation and prediction at various stages of the software development
process." The project has led to several results and models. This report concentrates
on results from the last year of our study (August, 1983 - August, 1984).

Large-Scale  Software  Development  Models 

Large-scale software development models are models that explain the effort in
constructing software products involving a team of programmers. Each such model
generally has as parameters the following:

S  -  size  in  thousands  of  lines  of  code 
E  -  development  effort  in  programmer-months 
D  -  development  duration  in  months. 

We have investigated a number of large-scale software development models. In
this document we report on three of the best known: the Doty model [Herd 77],
Putnam’s software equation [Putnam 78] (the basis for the SLIM model), and
Boehm’s COCOMO model [Boehm 81].

The  Doty  model  has  a  simple  form  for  large-size  projects: 

E = 5.288 \times S^{1.047}

Putnam’s  software  equation  can  be  written  as 


E = k \times \frac{S^3}{D^4}


where k is a "technology" constant. Boehm’s COCOMO (Constructive COst MOdel)
can have many input parameters in its "Intermediate" form

E = a_i S^{b_i} m(X)

in which a_i and b_i change with three development modes and m(X) is the product of
fifteen cost driver attributes.
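The three baseline models can be sketched side by side as follows. This is a minimal illustration, not the code used in the study. The Putnam function assumes the cubic-in-size, inverse-fourth-power-in-duration reading of the software equation, and its constant k must be supplied by the caller. The COCOMO coefficients are the nominal Intermediate values published in [Boehm 81]; they are not listed in this report. All function and variable names are hypothetical.

```python
# Sketch of the three baseline effort models. S is size in thousands of
# lines of code, D is duration in months, and each function returns
# effort E in programmer-months.

# Nominal Intermediate COCOMO coefficients (a_i, b_i) per development mode,
# taken from [Boehm 81]; they do not appear in this report.
COCOMO_MODES = {
    "organic": (3.2, 1.05),
    "semidetached": (3.0, 1.12),
    "embedded": (2.8, 1.20),
}


def doty_effort(size_kloc):
    """Doty model for large-size projects: E = 5.288 * S^1.047."""
    return 5.288 * size_kloc ** 1.047


def putnam_effort(size_kloc, duration_months, k):
    """Putnam's software equation, assumed form E = k * S^3 / D^4."""
    return k * size_kloc ** 3 / duration_months ** 4


def cocomo_effort(size_kloc, mode="organic", cost_drivers=()):
    """Intermediate COCOMO: E = a_i * S^b_i * m(X).

    cost_drivers holds the individual multipliers whose product is m(X);
    an empty sequence means every driver is nominal (m(X) = 1).
    """
    a, b = COCOMO_MODES[mode]
    m = 1.0
    for multiplier in cost_drivers:
        m *= multiplier
    return a * size_kloc ** b * m
```

Note how strongly the Putnam form depends on duration: halving D multiplies the estimated effort by sixteen, whereas the Doty and COCOMO forms do not use D at all.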

During the course of this research we have developed at Purdue University the
COPMO (Cooperative Programming MOdel) [Thebaut 83, 84]:

E = E_p(S) + E_c(P)

in which E_p(S) is the contribution to total effort from programming a project of size
S and E_c(P) is that portion of total effort required by the necessary coordination of
the individual efforts of a programming team with average team size P.

In evaluating models we use the following metrics:

MRE  -  the  mean  magnitude  of  relative  error,  i.e.,  the  mean  percentage  of  error 
between  a  model’s  predicted  effort  and  the  actual  recorded  effort, 

and 

PRED(.25) - the percentage of predicted effort values that fall within 25% of the
actual recorded effort values.
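The two metrics can be computed as in the following sketch. It assumes that relative error is measured against the actual effort and that a prediction exactly 25% off still counts as "within 25%"; the report does not state either convention. The function names are hypothetical.

```python
def relative_errors(actuals, predictions):
    """Magnitude of relative error |actual - predicted| / actual per project."""
    return [abs(a - p) / a for a, p in zip(actuals, predictions)]


def mean_mre(actuals, predictions):
    """MRE: mean magnitude of relative error over all projects."""
    errors = relative_errors(actuals, predictions)
    return sum(errors) / len(errors)


def pred(actuals, predictions, level=0.25):
    """PRED(level): fraction of predictions whose relative error is <= level."""
    errors = relative_errors(actuals, predictions)
    return sum(1 for e in errors if e <= level) / len(errors)
```

For example, with actual efforts of 100, 200, 50, and 80 programmer-months and predictions of 110, 150, 90, and 80, the individual errors are .10, .25, .80, and 0, so the MRE is about .29 and PRED(.25) is 75%. A single badly missed project raises the MRE sharply while lowering PRED(.25) by only one project's share, which is why both figures are reported.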

Using a database we have gathered of 189 industry, military, and university
projects, we examined the performance of each of the models. The results appear in
Table 1 below.


Table 1.

              MRE     PRED(.25)
Doty          1.85    21%
Putnam        1.16     6%
COCOMO         .79    38%
COPMO          .41    45%

Note  that  the  performance  of  the  COCOMO  model  is  somewhat  better  than  that  of 
the  Doty  and  Putnam  models  (both  of  which  are  clearly  unacceptable).  Furthermore, 
note  that  COPMO  is  somewhat  better  still  than  COCOMO  in  terms  of  both  mean 
magnitude  of  relative  error  and  percentage  of  estimates  that  fall  within  25%  (an 
acceptable  range)  of  their  actual  effort  values. 

In order to make the COPMO an even better predictor of programming effort,
we recognized that a single model may not be acceptable for wide ranges of
programming productivity. This led to the Generalized COPMO

E = b_i S + c_i P^{1.5}

in which b_i and c_i are allowed to vary depending upon the productivity of the
programmers involved. For the 189 projects for which we have data this led us to the
ten classes and b_i and c_i values shown in Table 2.


Table 2.

LOC/MM        b_i     c_i
0-85          4.7     3.6
86-200        3.0     2.0
201-300       2.4     1.5
301-400       2.2     1.2
401-500       1.8     1.0
501-600       1.6     0.9
601-700       1.2     0.8
701-800       1.1     0.7
901-1000      0.9     0.5
1001-         0.8     0.4

Note that in Table 2 LOC/MM refers to lines of code per programmer-month.
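A minimal sketch of how Table 2 might be applied is given below. It assumes S is in thousands of lines of code and P is average team size, as defined earlier, and that the exponent on P is 1.5. Table 2 lists no class for 801-900 LOC/MM, so the sketch refuses to estimate in that range rather than guess coefficients. The names are hypothetical.

```python
# (upper bound of the LOC/MM class, b_i, c_i), copied from Table 2.
# Table 2 has no 801-900 row, so that class is marked as undefined.
COPMO_CLASSES = [
    (85, 4.7, 3.6),
    (200, 3.0, 2.0),
    (300, 2.4, 1.5),
    (400, 2.2, 1.2),
    (500, 1.8, 1.0),
    (600, 1.6, 0.9),
    (700, 1.2, 0.8),
    (800, 1.1, 0.7),
    (900, None, None),
    (1000, 0.9, 0.5),
    (float("inf"), 0.8, 0.4),
]


def copmo_coefficients(loc_per_mm):
    """Return (b_i, c_i) for a productivity given in LOC per programmer-month."""
    for upper, b, c in COPMO_CLASSES:
        if loc_per_mm <= upper:
            if b is None:
                raise ValueError("Table 2 lists no class for 801-900 LOC/MM")
            return b, c


def generalized_copmo_effort(size_kloc, team_size, loc_per_mm):
    """Generalized COPMO: E = b_i * S + c_i * P^1.5, in programmer-months."""
    b, c = copmo_coefficients(loc_per_mm)
    return b * size_kloc + c * team_size ** 1.5
```

For a 10 KLOC project with an average team of 4 in the 86-200 class, this gives 3.0 x 10 + 2.0 x 8 = 46 programmer-months. The productivity class has to be known before the estimate is made, so in practice it would have to come from some source other than the project being estimated.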

This  Generalized  COPMO  leads  to  a  new  entry  in  Table  1  as  shown  in  Table  3. 


Table 3.

                      MRE     PRED(.25)
Doty                  1.85    21%
Putnam                1.16     6%
COCOMO                 .79    38%
COPMO                  .41    45%
Generalized COPMO      .21    75%

Note that the Generalized COPMO outperforms all other models (including the
COPMO from which it was derived). This performance of .21 mean magnitude of
relative error and 75% acceptable estimates seems to us a step toward a useful effort
estimator. On the other hand, note that the information used to obtain the
productivity figures was obtained from the project data. It remains for us to show that the
estimates will be as good when b_i and c_i are derived from historical data alone.

Early  Size  and  Effort  Estimation 

In our continuing research on software development models we have
seen that most models employ size as one of the most important parameters, i.e.,

E = f(SIZE, others).


Thus, for most models accurate effort estimation relies on accurate size estimation
and errors in size estimation will lead to errors (often significant ones) in effort
estimation.
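How a size error carries through to effort can be illustrated for any model of the power-law form E = a S^b. The sketch below is an illustration only: it holds every other input fixed and assumes the relative size error is known. The function name is hypothetical.

```python
def effort_error_from_size_error(size_error, exponent):
    """Relative effort error when E = a * S**exponent and S is off by size_error.

    size_error is a signed fraction, e.g. 0.20 for a 20% overestimate of size.
    The constant a cancels out, so it is not needed.
    """
    return (1.0 + size_error) ** exponent - 1.0
```

With an exponent of 1.047 (the Doty value), a 20% overestimate of size produces roughly a 21% overestimate of effort. Larger exponents amplify the size error further.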


Our work in early size and effort estimation has proceeded from the basic idea
that  we  should  be  able  to  observe  (and  measure)  some  metric  early  in  the  software 
development  process  that  will  lead  to  acceptable  final  size  and  total  effort  estimates. 
Furthermore,  if  this  process  is  repeated  at  various  stages  during  the  software 
development  process,  we  should  be  able  to  refine  these  estimates  toward  greater 
accuracy  as  software  development  proceeds. 

In our most recent (and most successful) work in this area we employed the following
ideas and hypotheses [Wang 84]:

(1) In our experimental work our subjects used an "incremental" strategy of
producing the program routine-by-routine (along with data structures for the next
lower level as discussed below). In this strategy, each subprogram was tested
before proceeding to the next. We hypothesized (and found it to be supported
by our data) that LOC (the number of lines of code in the program) evolves
linearly under such a strategy. That is, the "curve" showing lines of code in
place in the program increases at an approximately constant rate throughout
program development.

(2) We hypothesized that VARS (the number of variables used in a program) is a
size-related metric. In our experimental work a "top-down data-structure-first"
development strategy was used in which subjects introduced the main software
routine and the data items for the next level in a recursive manner. During the
software development process VARS evolved in a concave-down manner (as we
had assumed). That is, a curve showing variables in place in the program
increases very rapidly from zero and later flattens out as few variables are added
in the latter stages of software development.

(3) These evolutionary metric curves form useful "finger prints" of the development
process that can be used by the programmer and manager (in our case
experimenter) alike.

In our empirical investigation we employed forty-four Computer Science
graduate students at Purdue University in the summer of 1983. Each was to construct two
approximately 400-line Pascal programs. Each subject was allowed a two to three
week development time (25 to 35 hours programming time) for each program.

Our  hypotheses  concerning  curve  forms  for  LOC  and  VARS  were  supported  and 
we  arrived  empirically  at  the  following  functional  forms  for  modelling  the  evolution 
of LOC and VARS:

LOC(t) = a \times t^{1.2}

Note  that  the  1.2  exponent  is  very  close  to  linear  (i.e.,  1.0)  while  reflecting  a  slight 
period  of  inactivity  at  the  beginning  of  each  software  development  process. 

VARS(t) = a \times t^{.2}

Note  that  the  exponent  .2  suggests  a  very  concave-down  curve. 
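The shape of the two curves can be explored with the short sketch below. It assumes the exponents 1.2 for LOC and .2 for VARS, treats t as elapsed development time in any consistent unit, and fits the scale constant a from a single observation. Projecting to a planned completion time in this way is our illustration only; it is not the estimation procedure of [Wang 84], which this outline does not spell out. All names are hypothetical.

```python
LOC_EXPONENT = 1.2   # assumed exponent of the LOC evolution curve
VARS_EXPONENT = 0.2  # assumed exponent of the VARS evolution curve


def fraction_in_place(fraction_of_time, exponent):
    """Share of the final metric value present after a given share of the time."""
    return fraction_of_time ** exponent


def project_metric(observed_value, elapsed_time, final_time, exponent):
    """Fit a from one observation of metric = a * t**exponent, then evaluate
    the same curve at final_time."""
    a = observed_value / elapsed_time ** exponent
    return a * final_time ** exponent
```

Under these forms, halfway through development about 87 percent of the variables are already in place but only about 44 percent of the lines of code. This difference is what makes a concave-down metric such as VARS attractive as an early indicator of final size.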

Our early size estimation results appear in Table 4.

Table 4.

Program   Estimate    MRE     PRED(.25)
1         LOC_VARS    .22     68%
          LOC_pgmr    .23     57%
2         LOC_VARS    .24     61%
          LOC_pgmr    .42     52%

LOC_pgmr is the size estimate based on an early interview of each programmer on
each project. LOC_VARS is an early size estimate obtained at the same early interview
time by observing the evolution and current status of the metric VARS. Note that
size estimates based on VARS outperform the subjects’ own subjective size estimates
for both programs.

Our  early  effort  estimation  results  appear  in  Table  5. 


Table 5.

Program   Estimate    MRE     PRED(.25)
1         E_VARS      .24     64%
          E_pgmr      .42     57%
2         E_VARS      .28     52%
          E_pgmr      .86      9%

E_pgmr is the programmer’s own effort estimate obtained in the early interview of
each programmer on each project. E_VARS is the effort estimate obtained at the same
early interview time by observing the evolution and current status of the metric
VARS. Note that effort estimates based on VARS outperform (quite a bit) the
subjects’ own subjective effort estimates for both programs.

We consider that this research has shown the possibility of using our evolution
model of the interaction between data structures and size as a tool for early
determination of the final size and total effort of the software development process. More
studies toward confirmation must be performed.

Software Defects

In  the  experiment  involving  the  forty-four  subjects  at  Purdue  University  in  the 
summer  of  1983  we  considered  that  programming  ended  when  programs  ran  correctly 
on  some  standard  acceptance  test  cases.  During  testing  subjects  ran  their  programs 
against  some  additional  standard  test  cases  and  made  necessary  alterations.  We 
investigated several parameters including "time spent in programming" and the
"exhaustiveness of the acceptance test cases", but we found no relationship other
than  our  simplest  hypothesis  -  the  correlation  between  testing  time  and  defects 
discovered  during  testing  was  a  significant  .47. 
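A correlation of this kind can be computed as in the sketch below. It assumes the ordinary Pearson product-moment coefficient; the outline does not say which coefficient was used. The function name is hypothetical and no data from the study is reproduced here.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between paired observations, e.g. testing time
    (xs) and defects discovered during testing (ys), one pair per subject."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)
```

A coefficient of .47 means that the squared correlation is about .22, so testing time accounts for roughly a fifth of the variation in defects discovered under a linear model.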

In an industrial study involving the analysis of 1428 program modules written in
Pascal, PL/1, and Assembly language, we investigated the factors affecting CMD, the
count of distinct module defects [Shen 84]. We found that η_2, the number of unique
operands in a program, and DE, the total number of decisions (i.e., Boolean
expressions) in a program, were the best estimators of CMD. Note that η_2 is very strongly
related to VARS, which proved so successful in the size and effort estimation study
reported above.

Supporting  Analyzers  and  Software  Metrics  Data  Collection 

In order to conduct our software metrics work we have produced software
analyzers - counters that compute basic metrics for programs written in the
languages Fortran, Cobol, Pascal, and C. More than fifty of these have been
distributed to interested groups in the military, industry, and universities.

Our Software Metrics Data Collection is a large, comprehensive set of data
representing nineteen different program development histories from military,
industry, and university projects.

Future  Research 

(1)  We  intend  to  continue  our  work  on  the  Generalized  Cooperative  Programming 
Model.  We  are  looking  for  an  early  way  of  determining  the  complexity  classes. 

(2)  Our  early  size  and  effort  estimation  work  will  continue.  We  need  to 

(a)  refine  our  models, 

(b)  obtain  "real  world"  verification  by  the  use  of  non-university  project  data, 

(c)  continue  to  investigate  development  strategies  for  better  estimation  and 
control,  and 

(d) refine the use of evolution curves as "finger prints" of the software
development process.

(3)  We  want  to  investigate  further  the  area  of  software  defects.  Immediately  we 
see  a  need  to 

(a)  refine  the  definition  of  "defects", 

(b)  determine  the  best  predictors  of  defects, 

(c)  determine  the  relationship(s)  among  effort,  complexity,  and  defects, 

(d)  determine  how  best  to  split  time  into  development  and  testing  phases,  and 

(e) investigate program design languages and techniques that "avoid" defects.